Extracting Lexical Reference Rules from Wikipedia
Abstract
This paper describes the extraction from Wikipedia of lexical reference rules, identifying references to term meanings triggered by other terms. We present extraction methods geared to cover the broad range of the lexical reference relation and analyze them extensively. Most extraction methods yield high precision levels, and our rule-base is shown to perform better than other automatically constructed baselines in a couple of lexical expansion and matching tasks. Our rule-base yields comparable performance to WordNet while providing largely complementary information.
Similar resources
Mining Wikipedia for Large-scale Repositories of Context-Sensitive Entailment Rules
This paper focuses on the central role played by lexical information in the task of Recognizing Textual Entailment. In particular, the usefulness of lexical knowledge extracted from several widely used static resources, represented in the form of entailment rules, is compared with a method to extract lexical information from Wikipedia as a dynamic knowledge resource. The proposed acquisition me...
Learning a Lexical Simplifier Using Wikipedia
In this paper we introduce a new lexical simplification approach. We extract over 30K candidate lexical simplifications by identifying aligned words in a sentence-aligned corpus of English Wikipedia with Simple English Wikipedia. To apply these rules, we learn a feature-based ranker using SVMrank trained on a set of labeled simplifications collected using Amazon’s Mechanical Turk. Using human si...
For the sake of simplicity: Unsupervised extraction of lexical simplifications from Wikipedia
We report on work in progress on extracting lexical simplifications (e.g., “collaborate” → “work together”), focusing on utilizing edit histories in Simple English Wikipedia for this task. We consider two main approaches: (1) deriving simplification probabilities via an edit model that accounts for a mixture of different operations, and (2) using metadata to focus on edits that are more likely ...
Improving Persian Named Entity Recognition Using the Ezafe Marker
Named entity recognition is the process of identifying the names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages, in a text. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
Extracting Context-Rich Entailment Rules from Wikipedia Revision History
Recent work on Textual Entailment has shown a crucial role of knowledge to support entailment inferences. However, it has also been demonstrated that currently available entailment rules are still far from being optimal. We propose a methodology for the automatic acquisition of large scale context-rich entailment rules from Wikipedia revisions, taking advantage of the syntactic structure of ent...